Learning Effective Embeddings from Medical Notes

Authors

  • Sebastien Dubois
  • Nathanael Romano
Abstract

With the large amount of data they contain and the variety of features they offer, electronic health records (EHR) have attracted considerable interest in recent years and are starting to be widely used by the machine learning and bioinformatics communities. While typical numerical fields such as demographics, vitals, lab measurements, diagnoses and procedures are natural to use in machine learning models, there is not yet a consensus on how to use free-text clinical notes. We show how embeddings can be learned from patients’ histories of notes at the word, note and patient levels, using simple neural and sequence models. We show on several relevant evaluation tasks that these embeddings transfer easily to smaller problems, where they enable accurate predictions using only clinical notes.
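To make the three levels of representation concrete, here is a minimal sketch of how word vectors could be composed into note vectors and then into a patient vector. It is not the authors' method: the abstract mentions simple neural and sequence models, whereas this sketch assumes plain mean pooling and uses random vectors as stand-ins for learned word embeddings. The function names (`make_word_vectors`, `embed_note`, `embed_patient`) and the example notes are hypothetical.

```python
import random


def make_word_vectors(vocab, dim=4, seed=0):
    """Stand-in for learned word embeddings: one random vector per word (assumption)."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in vocab}


def mean_vector(vectors, dim):
    """Element-wise mean of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_note(note, word_vectors, dim):
    """Note-level embedding: mean of the vectors of the known words in the note."""
    tokens = note.lower().split()
    return mean_vector([word_vectors[t] for t in tokens if t in word_vectors], dim)


def embed_patient(notes, word_vectors, dim):
    """Patient-level embedding: mean of the note embeddings in the patient's history."""
    return mean_vector([embed_note(n, word_vectors, dim) for n in notes], dim)


DIM = 4
# Hypothetical two-note history for one patient.
notes = ["patient reports chest pain", "chest pain resolved"]
vocab = sorted({t for n in notes for t in n.lower().split()})
word_vectors = make_word_vectors(vocab, DIM)
patient_vector = embed_patient(notes, word_vectors, DIM)
print(len(patient_vector))
```

Mean pooling ignores word order and the order of notes over time; a sequence model over the note embeddings would be one way to capture the latter.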

Related resources

A Comparison of Word Embeddings for the Biomedical Natural Language Processing

Background: Neural word embeddings have been widely used in biomedical Natural Language Processing (NLP) applications as they provide vector representations of words capturing the semantic properties of words and the linguistic relationship between words. Many biomedical applications use different textual resources (e.g., Wikipedia and biomedical articles) to train word embeddings and apply thes...

Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation

Word sense disambiguation helps identify the proper sense of ambiguous words in text. With large terminologies such as the UMLS Metathesaurus, ambiguities appear and highly effective disambiguation methods are required. Supervised learning methods are one of the approaches used to perform disambiguation. Features extracted from the context of an ambiguous word are used to identif...

A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings

Learning word embeddings has received a significant amount of attention recently. Often, word embeddings are learned in an unsupervised manner from a large collection of text. The genre of the text typically plays an important role in the effectiveness of the resulting embeddings. How to effectively train word embedding models using data from different domains remains a problem that is underexp...

Assessing the Readability of Medical Documents: A Ranking Approach

BACKGROUND: The use of electronic health record (EHR) systems with patient engagement capabilities, including viewing, downloading, and transmitting health information, has recently grown tremendously. However, using these resources to engage patients in managing their own health remains challenging due to the complex and technical nature of the EHR narratives. OBJECTIVE: Our objective was to d...

Learning Low-Dimensional Representations of Medical Concepts

We show how to learn low-dimensional representations (embeddings) of a wide range of concepts in medicine, including diseases (e.g., ICD9 codes), medications, procedures, and laboratory tests. We expect that these embeddings will be useful across medical informatics for tasks such as cohort selection and patient summarization. These embeddings are learned using a technique called neural languag...

Publication date: 2017